Entity ranking using click-log information

نویسندگان

  • Davide Mottin
  • Themis Palpanas
  • Yannis Velegrakis
چکیده

Log information describing the items the users have selected from the set of answers a query engine returns to their queries constitute an excellent form of indirect user feedback that has been extensively used in the web to improve the effectiveness of search engines. In this work we study how the logs can be exploited to improve the ranking of the results returned by an entity search engine. Entity search engines are becoming more and more popular as the web is changing from a web of documents into a “web of things”. We show that entity search engines pose new challenges since their model is different than the one documents are based on. We present a novel framework for feature extraction that is based on the notions of entity matching and attribute frequencies. The extracted features are then used to train a ranking classifier. We introduce different methods and metrics for ranking, we combine them with existing traditional techniques and we study their performance using real and synthetic data. The experiments show that our technique provides better results in terms of accuracy.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

An Ensemble Click Model for Web Document Ranking

Annually, web search engine providers spend more and more money on documents ranking in search engines result pages (SERP). Click models provide advantageous information for ranking documents in SERPs through modeling interactions among users and search engines. Here, three modules are employed to create a hybrid click model; the first module is a PGM-based click model, the second module in a d...

متن کامل

Empirical Exploitation of Click Data for Query-Type-Based Ranking

In machine-learning-based ranking, each category of queries can be applied with a specific ranking function, which is called query-type-based ranking. Such a divideand-conquer strategy can potentially provide better ranking function for each query categories. A critical problem for the query-type-based ranking is training data insufficiency, which may be solved by using the data extracted from ...

متن کامل

Empirical Exploitation of Click Data for Task Specific Ranking

There have been increasing needs for task specific rankings in web search such as rankings for specific query segments like long queries, time-sensitive queries, navigational queries, etc; or rankings for specific domains/contents like answers, blogs, news, etc. In the spirit of ”divide-andconquer”, task specific ranking may have potential advantages over generic ranking since different tasks h...

متن کامل

Search Engine Click Spam Detection Based on Bipartite Graph Propagation

Using search engines to retrieve information has become an important part of people’s daily lives. For most search engines, click information is an important factor in document ranking. As a result, some websites cheat to obtain a higher rank by fraudulently increasing clicks to their pages, which is referred to as “Click Spam”. Based on an analysis of the features of fraudulent clicks, a novel...

متن کامل

Clickthrough Log Analysis by Collaborative Ranking

Analyzing clickthrough log data is important for improving search performance as well as understanding user behaviors. In this paper, we propose a novel collaborative ranking model to tackle two difficulties in analyzing clickthrough log. First, previous studies have shown that users tend to click topranked results even they are less relevant. Therefore, we use pairwise ranking relation to avoi...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Intell. Data Anal.

دوره 17  شماره 

صفحات  -

تاریخ انتشار 2013